
    Measuring the influence of concept detection on video retrieval

    There is an increasing emphasis on including semantic concept detection as part of video retrieval. This represents a modality for retrieval quite different from metadata-based and keyframe similarity-based approaches. One of the premises on which its success rests is that good-quality detection is available to guarantee retrieval quality. But how good does the feature detection actually need to be? Is it possible to achieve good retrieval quality even with poor-quality concept detection, and if so, what is the 'tipping point' below which detection accuracy proves not to be beneficial? In this paper we explore this question using a collection of rushes video in which we artificially vary the quality of detection of semantic features and study the impact on the resulting retrieval. Our results show that improving or degrading the performance of concept detectors is not directly reflected in retrieval performance, which raises interesting questions about how accurate concept detection really needs to be.
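The experimental setup described above, degrading detector quality and measuring the effect on retrieval, can be sketched in a few lines. This is a toy illustration under assumed simplifications (binary concept decisions, random label flips as the degradation model, precision-at-k as the retrieval metric), not the paper's actual protocol:

```python
import random

def degrade_detections(labels, flip_rate, seed=0):
    """Simulate a weaker concept detector by flipping a fraction of
    binary concept decisions (hypothetical degradation model)."""
    rng = random.Random(seed)
    return [1 - s if rng.random() < flip_rate else s for s in labels]

def precision_at_k(detector_scores, truth, k):
    """Rank shots by detector score, then measure precision against
    the ground truth among the top-k retrieved shots."""
    order = sorted(range(len(truth)), key=lambda i: -detector_scores[i])
    return sum(truth[i] for i in order[:k]) / k

# Toy ground truth for 10 video shots (1 = concept present).
truth = [1, 1, 0, 1, 0, 0, 1, 0, 0, 1]
ideal = precision_at_k(truth, truth, k=5)        # perfect detector
noisy = degrade_detections(truth, flip_rate=0.3)
degraded = precision_at_k(noisy, truth, k=5)     # weakened detector
```

Sweeping `flip_rate` from 0 to 1 and plotting `degraded` against it is one way to look for the 'tipping point' the abstract asks about.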

    People detection based on appearance and motion models

    A. Garcia-Martin, A. Hauptmann, and J. M. Martínez, "People detection based on appearance and motion models", in 8th IEEE International Conference on Advanced Video and Signal-Based Surveillance, AVSS 2011, pp. 256-260. The main contribution of this paper is a new people detection algorithm based on motion information. The algorithm builds a people motion model based on the Implicit Shape Model (ISM) framework and the MoSIFT descriptor. We also propose a detection system that integrates appearance, motion and tracking information. Experimental results over sequences extracted from the TRECVID dataset show that our new people motion detector produces results comparable to the state of the art and that the proposed multimodal fusion system improves the obtained results by combining the three information sources. This work has been partially supported by the Cátedra UAM-Infoglobal ("Nuevas tecnologías de vídeo aplicadas a sistemas de video-seguridad", i.e. "New video technologies applied to video-surveillance systems") and by the Universidad Autónoma de Madrid ("FPI-UAM: Programa propio de ayudas para la Formación de Personal Investigador", i.e. "FPI-UAM: internal grant programme for the training of research staff").
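A multimodal fusion system like the one described can be sketched as a late fusion of per-detection confidence scores. The weights, the three input lists, and the threshold below are illustrative assumptions, not values from the paper:

```python
def fuse_scores(appearance, motion, tracking, weights=(0.5, 0.3, 0.2)):
    """Hypothetical late fusion: weighted average of per-detection
    confidence scores from three sources (appearance, motion, tracking).
    The weights are made-up stand-ins, not the paper's values."""
    wa, wm, wt = weights
    return [wa * a + wm * m + wt * t
            for a, m, t in zip(appearance, motion, tracking)]

def keep_detections(fused, threshold=0.5):
    """Indices of detections whose fused confidence clears a threshold."""
    return [i for i, s in enumerate(fused) if s >= threshold]

# Two candidate detections scored by each of the three sources.
fused = fuse_scores([1.0, 0.2], [1.0, 0.1], [1.0, 0.0])
kept = keep_detections(fused)
```

A detection supported by all three cues keeps a high fused score, while one supported by a single weak cue is suppressed, which is the intuition behind combining the three information sources.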

    Exploring semantic inter-class relationships (SIR) for zero-shot action recognition

    © Copyright 2015, Association for the Advancement of Artificial Intelligence (www.aaai.org). All rights reserved. Automatically recognizing a large number of action categories from videos is of significant importance for video understanding. Most existing work has focused on designing more discriminative feature representations, and has achieved promising results when positive samples are plentiful. However, very limited effort has been spent on recognizing a novel action without any positive exemplars, which is often the case in real settings due to the large number of action classes and the dramatic variations in users' queries. To address this issue, we propose to perform action recognition when no positive exemplars of a class are provided, a setting often known as zero-shot learning. Unlike other zero-shot learning approaches, which exploit attributes as the intermediate layer for knowledge transfer, our main contribution is SIR, which directly leverages the semantic inter-class relationships between the known and unknown actions, followed by label transfer learning. The inter-class semantic relationships are automatically measured by continuous word vectors, which are learned by the skip-gram model on a large-scale text corpus. Extensive experiments on the UCF101 dataset validate the superiority of our method over fully-supervised approaches that use few positive exemplars.
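The core idea of label transfer via word-vector similarity can be sketched as follows. The 2-D vectors and scores below are toy stand-ins for skip-gram embeddings and seen-class classifier outputs; the exact transfer rule in the paper may differ:

```python
import numpy as np

def cosine(u, v):
    """Cosine similarity between two word vectors."""
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))

def zero_shot_score(unseen_vec, seen_vecs, seen_scores):
    """Label-transfer sketch: score an unseen action class as a
    similarity-weighted combination of the classifier scores of the
    seen classes (negative similarities clipped to zero)."""
    sims = np.array([cosine(unseen_vec, v) for v in seen_vecs])
    weights = np.maximum(sims, 0.0)
    weights = weights / weights.sum()
    return float(weights @ seen_scores)

# Toy embeddings: the unseen class is identical to seen class 0.
seen_vecs = [np.array([1.0, 0.0]), np.array([0.0, 1.0])]
seen_scores = np.array([0.9, 0.1])   # classifier scores on a test video
pred = zero_shot_score(np.array([1.0, 0.0]), seen_vecs, seen_scores)
```

Because the unseen class is semantically close to seen class 0, the transferred score is dominated by that class's classifier output, which is the mechanism that lets recognition proceed without any positive exemplars.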

    A model-based iterative learning approach for diffuse optical tomography

    Diffuse optical tomography (DOT) utilises near-infrared light for imaging spatially distributed optical parameters, typically the absorption and scattering coefficients. The image reconstruction problem of DOT is an ill-posed inverse problem, due to the non-linear light propagation in tissues and limited boundary measurements. The ill-posedness means that the image reconstruction is sensitive to measurement and modelling errors. The Bayesian approach for the inverse problem of DOT offers the possibility of incorporating prior information about the unknowns, rendering the problem less ill-posed. It also allows marginalisation of modelling errors utilising the so-called Bayesian approximation error method. A more recent trend in image reconstruction techniques is the use of deep learning, which has shown promising results in various applications from image processing to tomographic reconstructions. In this work, we study the non-linear DOT inverse problem of estimating the (absolute) absorption and scattering coefficients utilising a 'model-based' learning approach, essentially intertwining learned components with the model equations of DOT. The proposed approach was validated with 2D simulations and 3D experimental data. We demonstrated improved absorption and scattering estimates for targets with a mix of smooth and sharp image features, implying that the proposed approach could learn image features that are difficult to model using standard Gaussian priors. Furthermore, it was shown that the approach can be utilised to compensate for modelling errors due to coarse discretisation, enabling computationally efficient solutions. Overall, the approach provided improved computation times compared to a standard Gauss-Newton iteration.
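The "model-based" structure, intertwining a model-driven update with a learned component, can be sketched as one iteration that takes a Gauss-Newton step on the data-fit term and then applies a learned correction. Here the forward model is a toy linear map and `learned_update` is a plain function standing in for the trained network; the real DOT forward model is non-linear and the paper's learned component is trained, so this is only a structural sketch:

```python
import numpy as np

def model_based_step(x, forward, jacobian, y, learned_update):
    """One model-based iteration sketch: Gauss-Newton update on the
    data-fit term, followed by a learned correction (stand-in here)."""
    J = jacobian(x)
    r = forward(x) - y
    # Regularised normal equations for the Gauss-Newton direction.
    dx = np.linalg.solve(J.T @ J + 1e-9 * np.eye(x.size), J.T @ r)
    return learned_update(x - dx)

# Toy linear "forward model" A @ x with known ground truth.
A = np.array([[2.0, 0.0], [0.0, 3.0]])
x_true = np.array([1.0, 2.0])
y = A @ x_true
x = model_based_step(np.zeros(2), lambda x: A @ x, lambda x: A, y,
                     learned_update=lambda x: x)  # identity stand-in
```

With the identity in place of the learned component this reduces to plain Gauss-Newton; replacing it with a trained network is what lets the scheme express image features that a fixed Gaussian prior cannot.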

    Supporting High-Uncertainty Decisions through AI and Logic-Style Explanations

    A common criterion for Explainable AI (XAI) is to support users in establishing appropriate trust in the AI: rejecting advice when it is incorrect, and accepting advice when it is correct. Previous findings suggest that explanations can cause an over-reliance on AI (overly accepting advice). Explanations that evoke appropriate trust are even more challenging for decision-making tasks that are difficult for humans and AI alike. For this reason, we study decision-making by non-experts in the high-uncertainty domain of stock trading. We compare the effectiveness of three explanation styles (influenced by inductive, abductive, and deductive reasoning) and the role of AI confidence in terms of a) the users' reliance on the XAI interface elements (charts with indicators, AI prediction, explanation), b) the correctness of the decision (task performance), and c) the agreement with the AI's prediction. In contrast to previous work, we look at interactions between different aspects of decision-making, including AI correctness, and the combined effects of AI confidence and explanation styles. Our results show that specific explanation styles (abductive and deductive) improve the user's task performance in the case of high AI confidence compared to inductive explanations. In other words, these styles of explanation were able to invoke correct decisions (for both positive and negative decisions) when the system was certain. In this condition, the agreement between the user's decision and the AI prediction confirms the finding, highlighting a significant increase in agreement when the AI is correct. This suggests that both explanation styles are suitable for evoking appropriate trust in a confident AI. Our findings further indicate a need to consider AI confidence as a criterion for including or excluding explanations from AI interfaces. In addition, this paper highlights the importance of carefully selecting an explanation style according to the characteristics of the task and data.
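The design recommendation, using AI confidence as a criterion for including or excluding explanations, could be realised as a simple interface policy. The threshold and the exact gating rule below are assumptions for illustration, not parameters reported in the study:

```python
def choose_explanation(confidence, style, threshold=0.8):
    """Confidence-gated XAI sketch: show an abductive or deductive
    explanation only when the model is confident, reflecting the
    finding that these styles helped most under high AI confidence.
    The threshold is a hypothetical value, not from the paper."""
    if confidence >= threshold and style in ("abductive", "deductive"):
        return style
    return None  # omit the explanation from the interface

shown = choose_explanation(0.9, "abductive")
hidden = choose_explanation(0.5, "deductive")
```

Under this policy a low-confidence prediction is presented without an explanation, avoiding the over-reliance that explanations were previously found to cause.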

    Interference of a first-order transition with the formation of a spin-Peierls state in alpha'-NaV2O5?

    We present results of high-resolution thermal-expansion and specific-heat measurements on single-crystalline alpha'-NaV2O5. We find clear evidence for two almost degenerate phase transitions associated with the formation of the dimerized state around 33 K: a sharp first-order transition at T1 = (33 ± 0.1) K slightly below the onset of a second-order transition at T2,onset ≈ (34 ± 0.1) K. The latter is accompanied by pronounced spontaneous strains. Our results are consistent with a structural transformation at T1 induced by the incipient spin-Peierls (SP) order parameter above T2 = TSP.
    Comment: 5 pages, 7 figures

    Automatic vacant parking places management system using multicamera vehicle detection

    This paper presents a multicamera system for vehicle detection and the mapping of detections into the parking spots of a parking lot. Approaches from the state of the art, which work properly in controlled scenarios, have been validated using only a small number of sequences and without the more challenging conditions of realistic settings (illumination changes, different weather). Moreover, most of them are not complete systems but provide only parts of one, usually detectors. The proposed system has been designed for realistic scenarios, considering different cases of occlusion, illumination changes and different climatic conditions; a real scenario (the Pittsburgh International Airport parking lot) has been targeted with the condition that existing parking security cameras can be used, avoiding the deployment of new cameras or other sensor infrastructure. For design and validation, a new multicamera dataset has been recorded. The system is based on existing object detectors (the results of two of them are shown) and different proposed postprocessing stages. The results clearly show that the proposed system works correctly in challenging scenarios including almost total occlusions, illumination changes and different weather conditions. This work has been partially supported by the Spanish Government FPU grant programme (Ministerio de Educación, Cultura y Deporte) and by the Spanish government under the project TEC2014-53176-R (HAVideo).
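One natural postprocessing stage for mapping vehicle detections into parking spots is an overlap test between detector boxes and known spot regions. The axis-aligned boxes, the spot layout and the overlap threshold below are illustrative assumptions, not details taken from the paper:

```python
def iou(a, b):
    """Intersection-over-union of two axis-aligned boxes (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    if inter == 0:
        return 0.0
    union = ((a[2] - a[0]) * (a[3] - a[1])
             + (b[2] - b[0]) * (b[3] - b[1]) - inter)
    return inter / union

def occupied_spots(detections, spots, min_iou=0.3):
    """Mark a spot as occupied if any vehicle detection overlaps it
    by at least min_iou (hypothetical threshold)."""
    return {sid: any(iou(box, spot) >= min_iou for box in detections)
            for sid, spot in spots.items()}

spots = {"A": (0, 0, 10, 10), "B": (20, 0, 30, 10)}
status = occupied_spots([(1, 1, 9, 9)], spots)
```

In a multicamera setup, per-camera occupancy maps like `status` would then be merged, so a spot occluded in one view can still be resolved from another.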

    STMT: A Spatial-Temporal Mesh Transformer for MoCap-Based Action Recognition

    We study the problem of human action recognition using motion capture (MoCap) sequences. Unlike existing techniques that take multiple manual steps to derive standardized skeleton representations as model input, we propose a novel Spatial-Temporal Mesh Transformer (STMT) to directly model the mesh sequences. The model uses a hierarchical transformer with intra-frame offset attention and inter-frame self-attention. The attention mechanism allows the model to freely attend between any two vertex patches to learn non-local relationships in the spatial-temporal domain. Masked vertex modeling and future frame prediction are used as two self-supervised tasks to fully activate the bi-directional and auto-regressive attention in our hierarchical transformer. The proposed method achieves state-of-the-art performance compared to skeleton-based and point-cloud-based models on common MoCap benchmarks. Code is available at https://github.com/zgzxy001/STMT.
    Comment: CVPR 202
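The attention mechanism that lets any two vertex patches interact can be illustrated with a minimal single-head self-attention. Learned query/key/value projections and the hierarchical structure are omitted for brevity, so this is a toy stand-in for the idea, not STMT itself:

```python
import numpy as np

def self_attention(x):
    """Minimal single-head self-attention over a sequence of patch
    embeddings: every patch attends to every other patch, which is
    what enables the non-local relationships described in the abstract.
    No learned projections (toy sketch only)."""
    scores = x @ x.T / np.sqrt(x.shape[1])        # scaled dot products
    scores = scores - scores.max(axis=1, keepdims=True)  # stability
    w = np.exp(scores)
    w = w / w.sum(axis=1, keepdims=True)          # row-wise softmax
    return w @ x                                  # weighted mix of patches

patches = np.ones((3, 4))   # 3 vertex patches, 4-dim embeddings
out = self_attention(patches)
```

Because the attention weights form a convex combination over all patches, each output embedding mixes information from the whole sequence rather than from a fixed local neighbourhood.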

    Magnetic Resonance in the Spin-Peierls compound α'-NaV2O5

    We present results from magnetic resonance measurements for 75-350 GHz in α'-NaV2O5. The temperature dependence of the integrated intensity indicates that we observe transitions in the excited state. A quantitative description gives resonances in the triplet state at high-symmetry points of the excitation spectrum of this spin-Peierls compound. This energy has the same temperature dependence as the spin-Peierls gap. Similarities and differences with the other inorganic compound CuGeO3 are discussed.
    Comment: 2 pages, REVTEX, 3 figures. To be published in Phys. Rev.